sea's blog → Algebra, Lisp, and miscellaneous thoughts

Fuck Git - Start using Pijul

At present the software industry appears to have finally standardized on git. It did seem for a while there that mercurial was starting to gain traction, though.

However, git is just badly designed. It's built with the philosophy of: Get something that works, and screw having an elegant theory behind it.

Moreover, git's interface is absolutely abhorrent and confusing.

Actually patch-based instead of faking it

The git CLI gives the impression that git is patch-based. Operations work on commits, and the primary way a commit is presented to you is as a diff.

It would be most rational to believe that the underlying data model for git is that of patches, but that's wrong. git's underlying data model is snapshots: each commit points to a full tree of file blobs. The diff you see is computed on demand by comparing two snapshots, so the patch-based appearance is 'faked'. (Packfiles do delta-compress objects, but that is a storage detail, not the data model.)

This conceptual mismatch makes understanding git even harder. As a prerequisite to understanding how git really works under the hood, you need to first throw away the intuitive idea of patches, and think in terms of snapshots. It's disgusting.

Pijul, on the other hand, really is patch-based. The underlying data model uses patches as first-class objects, and that makes it possible to reason about how the program works and be correct about it.
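To make the difference concrete, here is a toy sketch in Python. It is not how either tool is implemented: the list-of-lines "snapshots", the `("insert", index, text)` patch tuples, and the function names are all made up for illustration (real pijul does not address lines by index). The point is only where the diff lives: derived after the fact in one model, stored as the primary object in the other.

```python
import difflib

# Snapshot model (git-like, heavily simplified): every commit stores the whole file.
snapshots = [
    ["a", "b"],
    ["a", "x", "b"],
]

def diff_between(old, new):
    """Derive a diff on demand by comparing two full snapshots."""
    return [line for line in difflib.ndiff(old, new)
            if line.startswith(("+ ", "- "))]

# Patch model (pijul-like, heavily simplified): the stored object IS the change.
# The tuple layout is a hypothetical format invented for this sketch.
patches = [
    ("insert", 0, "a"),
    ("insert", 1, "b"),
    ("insert", 1, "x"),
]

def apply_patches(patch_list):
    """Rebuild the file contents by replaying the stored changes."""
    lines = []
    for op, index, text in patch_list:
        if op == "insert":
            lines.insert(index, text)
    return lines

derived_diff = diff_between(snapshots[0], snapshots[1])  # computed, never stored
rebuilt = apply_patches(patches)                         # file derived from changes
```

In the first model the diff is a view over two snapshots. In the second, the file is a view over the changes.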

Provably Correct Merges

git's merge algorithm is broken. It's heuristic, so most of the time, it works. Sometimes, it fails. The actual algorithm was never proven correct, and now that it's adopted widely there isn't really a way to go back and fix it.

The classic example of this is the duplicated addition merge.

It is actually pretty rare for git to silently fail a merge, breaking the code in subtle ways, but it can happen, and it HAS happened in production in large companies before.

Stacking repeated changes isn't painful

In git, if you attempt to cherry-pick changes that are already present in your tree, you can actually get some really screwed-up conflict-resolution phases that are outrageously complicated.

In pijul, you can do this just fine. A change is a change, and you can include it or not.

In git, a change is not a change; it is a representation of some underlying snapshot in a linear(ish) chain of snapshots. Pick two of 'em, and they're ordered. Apply them in the wrong order and you get a conflict. Apply them the other way around and it might work.

Git is order-dependent on commits. That doesn't come up in practice very often, but it comes up often enough to irritate the hell out of you, especially if you collaborate with others and are constantly merging and fixing things, which increases the odds of these sorts of nightmares occurring.
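A rough way to see why the same change behaves differently in the two models is to look at what its identity depends on. The sketch below is a toy, with assumptions throughout: the function names, payload formats and hash choices are invented, and real git also hashes author and timestamp data. What it does reflect is that a git commit ID covers the parent commit and the full tree, while a patch-style ID can cover only the change itself and the changes it depends on.

```python
import hashlib

def git_like_commit_id(parent_id, tree_content, message):
    """Toy commit ID: as in git, the hash covers the parent and the full snapshot."""
    payload = f"parent {parent_id}\ntree {tree_content}\n\n{message}".encode()
    return hashlib.sha1(payload).hexdigest()

def patch_like_change_id(change_content, dependencies):
    """Toy change ID: the hash covers only the change and what it depends on."""
    payload = f"deps {sorted(dependencies)}\n{change_content}".encode()
    return hashlib.sha256(payload).hexdigest()

# The same edit ("add line x after a") made on two different branches.
on_main = git_like_commit_id("main-tip", "a\nx\nb", "add x")
on_feature = git_like_commit_id("feature-tip", "a\nx\nb\nc", "add x")

change_on_main = patch_like_change_id("+x after a", ["change-that-added-a"])
change_on_feature = patch_like_change_id("+x after a", ["change-that-added-a"])

# In the patch model a repository state is a set of changes, so pulling in a
# change you already have is a no-op and independent changes have no order.
repo_one = {change_on_main, "other-change"}
repo_two = {"other-change", change_on_feature}
repo_one_again = repo_one | {change_on_feature}
```

Here `on_main` and `on_feature` differ even though the edit is identical, which is why a cherry-picked commit looks like a stranger to git. The two patch-style IDs are equal, so `repo_one`, `repo_two` and `repo_one_again` are all the same state.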

Conflicts are first-class

In git, a conflict is a fuck-up. You need to fix a conflict IMMEDIATELY, and you can't proceed until you do. Git models conflicts as broken states that require user intervention.

In pijul, in contrast, a conflict is a first-class object. If you have two conflicting patches, you can keep 'em both around just fine. The conflict is something that you can keep in your todo list to resolve later when you get around to it.

You don't have to drop everything to fix a conflicting merge right now. That's great.
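As a sketch of what "first-class" could mean here, the toy model below stores a conflict as ordinary data instead of raising an error. Everything in it is hypothetical (the `Slot` class, `merge`, `resolve`, and the assumption that both versions have the same number of positions), and it is far simpler than what pijul actually does. It only illustrates that a merge can complete, remember the conflict, and let you resolve it later.

```python
from dataclasses import dataclass, field

@dataclass
class Slot:
    """One position in a file; more than one alternative means a conflict."""
    alternatives: set = field(default_factory=set)

    @property
    def conflicted(self):
        return len(self.alternatives) > 1

def merge(ours, theirs):
    """Merge slot by slot, recording conflicts instead of failing.

    Assumes both versions have the same number of slots (a toy restriction).
    """
    return [Slot(a.alternatives | b.alternatives) for a, b in zip(ours, theirs)]

def resolve(slot, choice):
    """Resolving is just another change, applied whenever convenient."""
    if choice not in slot.alternatives:
        raise ValueError("choice must be one of the conflicting alternatives")
    return Slot({choice})

ours = [Slot({"a"}), Slot({"x"})]
theirs = [Slot({"a"}), Slot({"y"})]

merged = merge(ours, theirs)  # succeeds even though position 1 conflicts
pending = [i for i, slot in enumerate(merged) if slot.conflicted]

# ...do other work, then come back to the todo list later.
merged[1] = resolve(merged[1], "x")
still_pending = [i for i, slot in enumerate(merged) if slot.conflicted]
```

The merged state is valid the whole time; `pending` is just a list of things to get around to.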

No faking history

In git, you are always rewriting history and faking commits. Every so often, when it comes time to open a merge request (MR), you have to clean up your branch, and that involves faking the history. You erase everything that was there before, squash it, and write fresh commit messages.

That makes you look very nice when the history is reviewed, but it's not real. That's not how the development actually happened, and pedagogically, you aren't teaching junior developers the right lesson when you hide your tracks behind you.

Junior developers need to see the false starts and failed attempts and the fact that seniors make mistakes, too. In git, all of that is hidden away and people pretend to have done it perfectly the first time around.

I believe that git's culture of dishonest coding is dangerous, and that, even if we don't switch to pijul, we should really do something to address this. We need less dishonesty in the world, not more. Even if it turns out that leaving the true history in the repository makes it messy and cluttered, so be it, fuck it. That's what the reality is and I am comfortable with that: Development is messy and I am not afraid to show that.

The theory is elegant and verifiable

Pijul is based on a mathematically clean patch algebra (written out in a bunch of pages of category theory). It has a solid foundation, and it is the avatar of what true software engineering should be: a program based on elegant theory. Git is the opposite: a pragmatic "it works for me" handyman sort of tool, an "I did this in my shed and it's good enough".

git wasn't engineered; it was just thrown together and duct-taped to make it go. The fact that it works is very lucky for the software industry, but let's not hold onto the duct-taped tool longer than we have to. This isn't what we want our field to be. We want our field to be formalized and rigorous, don't we? Git isn't that. Pijul is.